EIGENVALUE ESTIMATES FOR NON-NORMAL MATRICES AND THE ZEROS OF
RANDOM ORTHOGONAL POLYNOMIALS ON THE UNIT CIRCLE

E. B. DAVIES AND BARRY SIMON

Abstract. We prove that for any $n \times n$ matrix, $A$, and $z$ with $|z| \ge \|A\|$, we have that $\|(z - A)^{-1}\| \le \cot(\frac{\pi}{4n}) \, \mathrm{dist}(z, \mathrm{spec}(A))^{-1}$. We apply this result to the study of random orthogonal polynomials on the unit circle.



1. Introduction 

This paper concerns a sharp bound on the approximation of eigen- 
values of general non-normal matrices that we found in a study of the 
zeros of orthogonal polynomials. We begin with a brief discussion of 
the motivating problem, which we return to in Section 7.

Given a probability measure $d\mu$ on $\mathbb{C}$ with

\[ \int |z|^n \, d\mu(z) < \infty \tag{1.1} \]

we define the monic orthogonal polynomials, $\Phi_n(z)$, by

\[ \Phi_n(z) = z^n + \text{lower order} \tag{1.2} \]

\[ \int \bar{z}^j \, \Phi_n(z) \, d\mu(z) = 0, \qquad j = 0, 1, \dots, n-1 \tag{1.3} \]

If

\[ P_n = \text{orthogonal projection in } L^2(\mathbb{C}, d\mu) \text{ onto polynomials of degree } n-1 \text{ or less} \tag{1.4} \]

then

\[ \Phi_n = (1 - P_n) z^n \tag{1.5} \]
A key role is played by the operator

\[ A_n = P_n M_z P_n \restriction \mathrm{Ran}(P_n) \tag{1.6} \]

where $M_z$ is the operator of multiplication by $z$ and $A_n$ is an operator on the $n$-dimensional space $\mathrm{Ran}(P_n)$.

If $z_0$ is a zero of $\Phi_n(z)$ of order $k$, then $f_{z_0} = (z - z_0)^{-k} \Phi_n(z)$ is in $\mathrm{Ran}(P_n)$ and

\[ (A_n - z_0)^k f_{z_0} = 0, \qquad (A_n - z_0)^{k-1} f_{z_0} \ne 0 \tag{1.7} \]

which implies

\[ \Phi_n(z) = \det(z - A_n) \tag{1.8} \]

Also, $\Phi_n(z)$ is the minimal polynomial for $A_n$.

In the study of orthogonal polynomials on the real line (OPRL), a key role is played by the fact that for any $y \in \mathrm{Ran}(P_n)$ with $\|y\|_{L^2} = 1$,

\[ \mathrm{dist}(z_0, \{\text{zeros of } \Phi_n\}) \le \|(A_n - z_0) y\| \qquad \text{(OPRL case)} \tag{1.9} \]

This holds because, in the OPRL case, $A_n$ is self-adjoint. Indeed, for any normal operator, $B$ (throughout, $\|\cdot\|$ is a Hilbert space norm; for $n \times n$ matrices, the usual matrix norm induced by the Euclidean inner product),

\[ \mathrm{dist}(z_0, \mathrm{spec}(B)) = \|(B - z_0)^{-1}\|^{-1} \tag{1.10} \]

and, of course, for any invertible operator $C$,

\[ \inf\{\|Cy\| \mid \|y\| = 1\} = \|C^{-1}\|^{-1} \tag{1.11} \]

We were motivated by seeking a replacement of (1.9) in a case where $A_n$ is non-normal. Indeed, we had a specific situation of orthogonal polynomials on the unit circle (OPUC; see |TT1 ITS]) where one has a sequence $z_n \in \partial\mathbb{D} = \{z \mid |z| = 1\}$ and corresponding unit trial vectors, $y_n$, so that

\[ \|(A_n - z_n) y_n\| \le C_1 e^{-C_2 n} \tag{1.12} \]

for all $n$ with $C_2 > 0$. We would like to conclude that $\Phi_n(z)$ has zeros near $z_n$.

It is certainly not sufficient that $\|(A_n - z_n) y_n\| \to 0$. Indeed, the case $d\mu(z) = d\theta/2\pi$ has $\Phi_n(z) = z^n$, so $\mathrm{dist}(1, \mathrm{spec}(A_n)) = 1$, but if $y_n = (1 + z + \cdots + z^{n-1})/\sqrt{n}$, then $\|(A_n - 1) y_n\| = \|P_n (z - 1) y_n\| = n^{-1/2} \|P_n (z^n - 1)\| = n^{-1/2} \|1\| = n^{-1/2}$. As we will see later, by a clever choice of $y_n$, one can even get trial vectors with $\|(A_n - 1) y_n\| = O(n^{-1})$.
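
The trial-vector computation above can be reproduced in coordinates. The following is only a sketch: it assumes the monomial basis $1, z, \dots, z^{n-1}$ (orthonormal for $d\theta/2\pi$), in which $A_n$ sends $z^j$ to $z^{j+1}$ and annihilates $z^{n-1}$. The helper names `apply_A` and `trial_residual` are illustrative and do not come from the paper.

```python
import math

def apply_A(v):
    """Apply A_n in the basis 1, z, ..., z^(n-1): z^j -> z^(j+1), z^(n-1) -> 0."""
    return [0.0] + v[:-1]

def trial_residual(n):
    """Return ||(A_n - 1) y_n|| for y_n = (1 + z + ... + z^(n-1)) / sqrt(n)."""
    y = [1.0 / math.sqrt(n)] * n
    diff = [a - b for a, b in zip(apply_A(y), y)]
    return math.sqrt(sum(d * d for d in diff))

# The residual decays like n^(-1/2) although dist(1, spec(A_n)) = 1 for every n.
residuals = {n: trial_residual(n) for n in (4, 16, 64)}
```

Only the coefficient of the constant term survives in $(A_n - 1) y_n$, which is why the norm equals $n^{-1/2}$ exactly.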

Of course, by (1.11), we are really seeking some kind of bound relating $\|(A_n - z_n)^{-1}\|$ to $\mathrm{dist}(z_n, \mathrm{spec}(A_n))$. At first sight, the prognosis for this does not seem hopeful. The $n \times n$ matrix,

\[ N_n = \begin{pmatrix} 0 & 1 & & 0 \\ & 0 & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix} \tag{1.13} \]

has

\[ \|(z - N_n)^{-1}\| \ge |z|^{-n} \tag{1.14} \]

since $(z - N_n)^{-1} = \sum_{j=0}^{n-1} z^{-j-1} (N_n)^j$ has $z^{-n}$ in the $1, n$ position. Thus, as is well known, $\|(A_n - z)^{-1}\|$ for general $n \times n$ matrices $A_n$ and general $z$ cannot be bounded by better than $\mathrm{dist}(z, \mathrm{spec}(A_n))^{-n}$. Indeed, the existence of such bounds by Henrici [4] is part of an extensive literature on general variational bounds on eigenvalues. Translated to a variational bound, this would give $\mathrm{dist}(z_n, \{\text{zeros of } \Phi_n\}) \le C \|(A_n - z_n) y_n\|^{1/n}$, which would not give anything useful from (1.12).

We note that as $n \to \infty$, there can be difficulties even if $z_0$ stays away from $\mathrm{spec}(A_n)$. For, by (1.14),

\[ \|(1 - 2N_n)^{-1}\| \ge 2^{n-1} \tag{1.15} \]

diverges as $n \to \infty$ even though $\|2N_n\|$ is bounded in $n$.
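
The growth in (1.15) can be seen directly from the finite Neumann series. The sketch below uses exact integer arithmetic and the fact that $N_n^n = 0$; the function names are illustrative, not taken from the paper.

```python
def shift(n):
    """The n x n nilpotent matrix N_n of (1.13): ones on the superdiagonal."""
    return [[1 if k == j + 1 else 0 for k in range(n)] for j in range(n)]

def matmul(a, b):
    """Plain matrix product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_one_minus_2N(n):
    """(1 - 2 N_n)^(-1) = sum_{j=0}^{n-1} (2 N_n)^j, exact because N_n^n = 0."""
    two_n = [[2 * x for x in row] for row in shift(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    total = [[0] * n for _ in range(n)]
    for _ in range(n):
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
        power = matmul(power, two_n)
    return total

n = 6
inv = inverse_one_minus_2N(n)
corner = inv[0][n - 1]  # the (1, n) entry, a lower bound for the operator norm
```

Since the operator norm dominates the modulus of every matrix entry, the corner entry $2^{n-1}$ gives (1.15).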

Despite these initial negative indications, we have found a linear variational principle that lets us get information from (1.12). The key realization is that $z_n$ and $\|A_n\|$ are not general. Indeed,

\[ |z_n| = \|A_n\| = 1 \tag{1.16} \]

It is not a new result that a linear bound holds in the generality 
we discuss. In [TT], Nikolski presents a general method for estimating 
norms of inverses in terms of minimal polynomials (see the proof of 
Lemma 3.2 of [TT]) that is related to our argument in Subsection 6.A.
His ideas yield a linear bound but not with the optimal constant we 
find. 

Our main theorem is 

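Theorem 1. Let $\mathcal{M}_n$ be the set of pairs $(A, z)$ where $A$ is an $n \times n$ matrix, $z \in \mathbb{C}$ with

\[ |z| \ge \|A\| \tag{1.17} \]

and

\[ z \notin \mathrm{spec}(A) \tag{1.18} \]

Then

\[ c(n) \equiv \sup_{\mathcal{M}_n} \mathrm{dist}(z, \mathrm{spec}(A)) \, \|(A - z)^{-1}\| = \cot\left(\frac{\pi}{4n}\right) \tag{1.19} \]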
Of course, the remarkable fact, given (1.14), is that $c(n) < \infty$ when we only use the first power of $\mathrm{dist}(z, \mathrm{spec}(A))$. It implies that so long as (1.17) holds,

\[ \mathrm{dist}(z, \mathrm{spec}(A)) \le c(n) \, \|(A - z) y\| \tag{1.20} \]

for any unit vector $y$. For this to be useful in the context of (1.12), we need only mild growth conditions on $c(n)$; see (1.21) below.
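As an amusing aside, we note that

\[ c(1) = 1 = 0 + \sqrt{1}, \qquad c(2) = 1 + \sqrt{2}, \qquad c(3) = 2 + \sqrt{3} \]

but the obvious extrapolation from this fails. Instead, because of properties of $\cot(x)$,

\[ c(n) \le \frac{4}{\pi} \, n, \qquad \frac{c(n)}{n} \text{ is monotone increasing to } \frac{4}{\pi} \tag{1.21} \]

so, in fact, for $n \ge 3$,

\[ \frac{2 + \sqrt{3}}{3} \le \frac{c(n)}{n} \le \frac{4}{\pi} \]

a spread of 2.3%.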
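
The numerical claims about $c(n)$ are easy to confirm. The sketch below simply evaluates $\cot(\pi/4n)$ from (1.19); the variable names and the range of $n$ tested are arbitrary choices, not part of the paper.

```python
import math

def c(n):
    """Sharp constant c(n) = cot(pi / (4 n)) from (1.19)."""
    return 1.0 / math.tan(math.pi / (4 * n))

# The pattern c(1), c(2), c(3) suggests (n - 1) + sqrt(n); it already fails at n = 4.
naive4 = 3 + math.sqrt(4)
exact4 = c(4)

# c(n)/n should increase monotonically toward 4/pi.
ratios = [c(n) / n for n in range(1, 2001)]
increasing = all(a < b for a, b in zip(ratios, ratios[1:]))

# Relative spread between the n = 3 value and the limit 4/pi.
spread = (4 / math.pi) / (c(3) / 3) - 1
```

Here `spread` is the relative gap between $c(3)/3$ and the limiting value $4/\pi$, which is the 2.3% quoted above.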

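We note that, by replacing $A$ by $A/z$ and $z$ by 1, it suffices to prove

\[ \sup_{\|A\| \le 1} \mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1}\| = \cot\left(\frac{\pi}{4n}\right) \tag{1.22} \]

and it is this that we will establish by proving three statements. We will use the special $n \times n$ matrix

\[ M_n = \begin{pmatrix} 1 & 2 & \cdots & 2 \\ & 1 & \ddots & \vdots \\ & & \ddots & 2 \\ 0 & & & 1 \end{pmatrix} \tag{1.23} \]

given by

\[ (M_n)_{k\ell} = \begin{cases} 2 & \text{if } k < \ell \\ 1 & \text{if } k = \ell \\ 0 & \text{if } k > \ell \end{cases} \tag{1.24} \]
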
Our three sub-results are 
Theorem 2. $\|M_n\| = \cot(\pi/4n)$

Theorem 3. For each $0 < a < 1$, there exist $n \times n$ matrices $A_n(a)$ with

\[ \|A_n(a)\| \le 1, \qquad \mathrm{spec}(A_n(a)) = \{a\} \]

and

\[ \lim_{a \uparrow 1} (1 - a)(1 - A_n(a))^{-1} = M_n \tag{1.25} \]

Theorem 4. Let $A$ be an upper triangular matrix with $\|A\| \le 1$ and $1 \notin \mathrm{spec}(A)$. Then

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \, |(1 - A)^{-1}_{k\ell}| \le \begin{cases} 2 & \text{if } k < \ell \\ 1 & \text{if } k = \ell \\ 0 & \text{if } k > \ell \end{cases} \tag{1.26} \]

Proof that Theorems 2-4 $\Rightarrow$ Theorem 1. Any matrix has an orthonormal basis in which it is upper triangular: One constructs such a Schur basis by applying Gram-Schmidt to any algebraic basis in which $A$ has Jordan normal form. In such a basis, (1.26) says that

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1} y\| \le \|M_n |y|\| \le \|M_n\| \, \|y\| \]

where $|y|$ is the vector with components $|y_j|$, so Theorem 2 implies LHS of (1.22) $\le \cot(\pi/4n)$.

On the other hand, using $A_n(a)$ in $\mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1}\|$ implies LHS of (1.22) $\ge \cot(\pi/4n)$. We thus have (1.22) and, as noted, this implies (1.19). □

To place Theorem 1 in context, we note that if $|z| > \|A\|$,

\[ \|(z - A)^{-1}\| \le \sum_{j=0}^{\infty} |z|^{-j-1} \|A\|^j = (|z| - \|A\|)^{-1} \tag{1.27} \]

So (1.19) provides a borderline between the dimension-independent bound (1.27) for $|z| > \|A\|$ and the exponential growth that may happen if $|z| < \|A\|$, essentially the phenomenon of pseudospectra which is well documented in [21]; see also |To].

The structure of this paper is as follows. In Section 2, we will prove Theorem 4, the most significant result in this paper since it implies $c(n) < \infty$ and, indeed, with no effort that $c(n) \le 2n$. Our initial proofs of $c(n) < \infty$ were more involved; the fact that our final proof is quite simple should not obscure the fact that $c(n) < \infty$ is a result we find both surprising and deep.

In Section 3, we use upper triangular Toeplitz matrices to construct $A_n(a)$ and prove Theorem 3. Sections 4 and 5 prove Theorem 2; indeed, we also find that if

\[ (Q_n(a))_{k\ell} = \begin{cases} 1 & \text{if } k < \ell \\ a & \text{if } k = \ell \\ 0 & \text{if } k > \ell \end{cases} \tag{1.28} \]

then

\[ \|Q_n(1)\| = \frac{1}{2 \sin\left(\frac{\pi}{4n+2}\right)} \tag{1.29} \]

which means we can compute $\|Q_n(a)\|$ for $a = 0, \frac{1}{2}, 1$. While the calculation of $\|M_n\|$ and $\|Q_n(1)\|$ is based on explicit formulae for all the eigenvalues and eigenvectors of certain associated operators, we could just pull them out of a hat. Instead, in Section 4, we discuss the motivation that led to our guess of eigenvectors, and in Section 5 explicitly prove Theorem 2.

Section 6 contains a number of remarks and extensions concerning Theorem 1, most importantly to numerical range concerns. Section 7 contains the application to random OPUC.

Acknowledgments. This work was done while B. Simon was a visitor at King's College London. He would like to thank A. N. Pressley and E. B. Davies for the hospitality of King's College, and the London Mathematical Society for partial support. The calculations of M. Stoiciu [20, 21] were an inspiration for our pursuing the estimate we found. We appreciate useful correspondence/discussions with M. Haase, N. Higham, R. Nagel, N. K. Nikolski, V. Totik, and L. N. Trefethen.

2. The Key Bound 

Our goal in this section is to prove Theorem 4. $A$ is an upper triangular $n \times n$ matrix. Let $\lambda_1, \dots, \lambda_n$ be its diagonal elements. Since

\[ \det(z - A) = \prod_{j=1}^{n} (z - \lambda_j) \tag{2.1} \]

the $\lambda_j$'s are the eigenvalues of $A$ counting algebraic multiplicity. In particular,

\[ \sup_j |1 - \lambda_j|^{-1} = \mathrm{dist}(1, \mathrm{spec}(A))^{-1} \tag{2.2} \]

Define

\[ C = (1 - A)^{-1} + (1 - A^*)^{-1} - 1 \tag{2.3} \]
Proposition 2.1. Suppose $\|A\| \le 1$. Then

(a)
\[ C_{jj} = |1 - \lambda_j|^{-2} (1 - |\lambda_j|^2) \le 2 |1 - \lambda_j|^{-1} \tag{2.4} \]

(b) $C \ge 0$

(c)
\[ |C_{jk}| \le |C_{jj}|^{1/2} |C_{kk}|^{1/2} \tag{2.5} \]

(d) If $j < k$, then $[(1 - A)^{-1}]_{jk} = C_{jk}$.

Proof. (a) Since $A$ is upper triangular,

\[ [(1 - A)^{-1}]_{jj} = (1 - \lambda_j)^{-1} \tag{2.6} \]

so (2.4) comes from

\[ (1 - \lambda_j)^{-1} + (1 - \bar\lambda_j)^{-1} - 1 = |1 - \lambda_j|^{-2} (1 - |\lambda_j|^2) \tag{2.7} \]

and the fact that for $|\lambda| \le 1$,

\[ |1 - \lambda|^{-1} (1 - |\lambda|^2) = (1 + |\lambda|)(1 - |\lambda|)(|1 - \lambda|^{-1}) \le 2 \]

since $1 - |\lambda| \le |1 - \lambda|$.

(b) The operator analog of (2.7) is the direct computation

\[ C = [(1 - A)^{-1}]^* (1 - A^*A)(1 - A)^{-1} \ge 0 \tag{2.8} \]

since $\|A\| \le 1$ implies $A^*A \le 1$.

(c) This is true for any positive definite matrix.

(d) $(1 - A^*)^{-1}$ is lower triangular and 1 is diagonal. □
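
The scalar identity (2.7) and the bound in (2.4) can be spot-checked at a few points of the closed unit disk. This is only a sanity check on hand-picked sample values, not a replacement for the algebra; the names below are illustrative.

```python
def C_diag(lam):
    """Left side of (2.7): (1 - lam)^(-1) + (1 - conj(lam))^(-1) - 1."""
    return 1 / (1 - lam) + 1 / (1 - lam.conjugate()) - 1

def closed_form(lam):
    """Right side of (2.7): |1 - lam|^(-2) (1 - |lam|^2)."""
    return (1 - abs(lam) ** 2) / abs(1 - lam) ** 2

# Sample eigenvalues with |lam| <= 1 and lam != 1 (chosen arbitrarily).
samples = [0.5 + 0.0j, -0.9 + 0.1j, 0.3 - 0.6j, 0.6 + 0.79j, 0.99j]

# For each sample: (error in (2.7), value of |1 - lam| * C_jj, which (2.4) bounds by 2).
checks = [(abs(C_diag(l) - closed_form(l)), closed_form(l) * abs(1 - l))
          for l in samples]
```

For instance, $\lambda = 1/2$ gives $C_{jj} = 3$ and $|1 - \lambda| C_{jj} = 3/2 \le 2$.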

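Proof of Theorem 4. $(1 - A)^{-1}$ is upper triangular so $[(1 - A)^{-1}]_{k\ell} = 0$ if $k > \ell$. By (2.6) and (2.2),

\[ |[(1 - A)^{-1}]_{kk}| = |1 - \lambda_k|^{-1} \le \mathrm{dist}(1, \mathrm{spec}(A))^{-1} \tag{2.9} \]

By (a), (c), (d) of the proposition, if $k < \ell$,

\[ |[(1 - A)^{-1}]_{k\ell}| \le \left[ |1 - \lambda_k|^{-2} |1 - \lambda_\ell|^{-2} (1 - |\lambda_k|^2)(1 - |\lambda_\ell|^2) \right]^{1/2} \le 2 \left[ |1 - \lambda_k|^{-1} |1 - \lambda_\ell|^{-1} \right]^{1/2} \le 2 [\mathrm{dist}(1, \mathrm{spec}(A))]^{-1} \]

by (2.2). □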
3. Upper Triangular Toeplitz Matrices

A Toeplitz matrix pQ is one that is constant along diagonals, that is, 
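$A_{jk}$ is a function of $j - k$. An $n \times n$ upper triangular Toeplitz matrix (UTTM) is thus of the form

\[ \begin{pmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} \\ 0 & a_0 & a_1 & \cdots & a_{n-2} \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & & 0 & a_0 \end{pmatrix} \tag{3.1} \]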

These concern us because $M_n$ is of this form and because the operators, $A_n(a)$, of Theorem 3 will be of this form. In this section, after recalling the basics of UTTM, we will prove Theorem 3. Then we will state some results, essentially due to Schur [TH], on the norms of UTTM that we will need in Section 5 in one calculation of the norm of $M_n$.

Given any function, $f$, which is analytic near zero, we write $T_n(f)$ for the matrix in (3.1) if

\[ f(z) = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1} + O(z^n) \tag{3.2} \]

$f$ is called a symbol for $T_n(f)$.
We note that

\[ T_n(fg) = T_n(f) T_n(g) \tag{3.3} \]

This can be seen by multiplying matrices and Taylor series or by manipulating projections on $\ell^2$ (see, e.g., Corollary 6.2.3 of [17]).
In addition, if $f$ is analytic in $\{z \mid |z| < 1\}$, then

\[ \|T_n(f)\| \le \sup_{|z| < 1} |f(z)| \tag{3.4} \]

To see this well-known fact, associate an analytic function

\[ v(z) = v_0 + v_1 z + \cdots \tag{3.5} \]

to the vector $\varphi_n(v) \in \mathbb{C}^n$ by

\[ \varphi_n(v) = (v_{n-1}, v_{n-2}, \dots, v_0)^T \tag{3.6} \]

and note that with $\|\cdot\|_2$, the $H^2$ norm,

\[ \|\varphi_n(v)\| = \inf\{\|u\|_2 \mid \varphi_n(u) = \varphi_n(v)\} \tag{3.7} \]

\[ T_n(f) \varphi_n(v) = \varphi_n(fv) \tag{3.8} \]

and

\[ \|fv\|_2 \le \|f\|_\infty \|v\|_2 \tag{3.9} \]

If $N_n$ is given by (1.13), then $T_n(f) = f(N_n)$, so an alternate proof of (3.4) may be based on von Neumann's theorem; see Subsection 6.E.
Proof of Theorem 3. For $a$ with $0 < a < 1$, define

\[ f_a(z) = \frac{z + a}{1 + az} \tag{3.10} \]

and define

\[ A_n(a) = T_n(f_a) \tag{3.11} \]

Then $f_a(e^{i\theta}) = e^{i\theta} (1 + a e^{-i\theta})/(1 + a e^{i\theta})$ has $|f_a(e^{i\theta})| = 1$, so $\sup_{|z| < 1} |f_a(z)| = 1$ and thus, by (3.4),

\[ \|A_n(a)\| \le 1 \tag{3.12} \]

By (3.1),

\[ \mathrm{spec}(A_n(a)) = \{f_a(0)\} = \{a\} \tag{3.13} \]

By (3.3),

\[ (1 - A_n(a))^{-1} = T_n((1 - f_a(z))^{-1}) \tag{3.14} \]

Now

\[ (1 - a)(1 - f_a(z))^{-1} = \frac{1 + az}{1 - z} \tag{3.15} \]

so

\[ \lim_{a \uparrow 1} (1 - a)(1 - f_a(z))^{-1} = \frac{1 + z}{1 - z} \tag{3.16} \]

Thus,

\[ \lim_{a \uparrow 1} (1 - a)(1 - A_n(a))^{-1} = T_n\left(\frac{1 + z}{1 - z}\right) = M_n \tag{3.17} \]

since $(1 + z)/(1 - z) = 1 + 2z + 2z^2 + \cdots$. □
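
Because of (3.3), computations with UTTMs reduce to arithmetic on truncated power series, so (3.15) can be checked without forming any matrices. In the sketch below a UTTM is represented by its first row $(a_0, \dots, a_{n-1})$; the helper names and the sample values $a = 0.9$, $n = 5$ are arbitrary illustrative choices.

```python
def series_mul(f, g):
    """Product of two truncated power series (lists of n coefficients), mod z^n."""
    n = len(f)
    return [sum(f[i] * g[k - i] for i in range(k + 1)) for k in range(n)]

def series_inv(f):
    """Inverse of a truncated power series with f[0] != 0, mod z^n."""
    n = len(f)
    g = [1.0 / f[0]] + [0.0] * (n - 1)
    for k in range(1, n):
        g[k] = -sum(f[i] * g[k - i] for i in range(1, k + 1)) / f[0]
    return g

def f_a_coeffs(a, n):
    """Taylor coefficients of f_a(z) = (z + a)/(1 + a z) up to z^(n-1), n >= 2."""
    num = [a, 1.0] + [0.0] * (n - 2)
    den = [1.0, a] + [0.0] * (n - 2)
    return series_mul(num, series_inv(den))

def scaled_resolvent_row(a, n):
    """First row of (1 - a)(1 - A_n(a))^(-1): coefficients of (1 - a)/(1 - f_a)."""
    f = f_a_coeffs(a, n)
    one_minus_f = [1.0 - f[0]] + [-c for c in f[1:]]
    return [(1.0 - a) * c for c in series_inv(one_minus_f)]

a, n = 0.9, 5
row = scaled_resolvent_row(a, n)
expected = [1.0] + [1.0 + a] * (n - 1)  # coefficients of (1 + a z)/(1 - z)
```

As $a \uparrow 1$ the expected row $(1, 1 + a, \dots, 1 + a)$ tends to $(1, 2, \dots, 2)$, the first row of $M_n$.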

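We now want to refine (3.4) to get equality for a suitable $f$. A key role is played by

Lemma 3.1. Let $\alpha \in \mathbb{D}$ and $A$ an operator with $\bar\alpha^{-1} \notin \mathrm{spec}(A)$. Define

\[ B = (A - \alpha)(1 - \bar\alpha A)^{-1} \tag{3.18} \]

Then

\[ \text{(1)} \quad \|B\| \le 1 \Leftrightarrow \|A\| \le 1 \tag{3.19} \]

\[ \text{(2)} \quad \|B\| = 1 \Leftrightarrow \|A\| = 1 \tag{3.20} \]

Proof. By a direct calculation,

\[ 1 - B^*B = (1 - \alpha A^*)^{-1} \left[ (1 - |\alpha|^2)(1 - A^*A) \right] (1 - \bar\alpha A)^{-1} \tag{3.21} \]

(3.19) follows since $1 - B^*B \ge 0 \Leftrightarrow 1 - A^*A \ge 0$, and (3.20) follows since (3.21) implies

\[ \inf_{\|\varphi\| = 1} \langle \varphi, (1 - B^*B) \varphi \rangle = 0 \Leftrightarrow \inf_{\|\varphi\| = 1} \langle \varphi, (1 - A^*A) \varphi \rangle = 0 \]
□

Remark. This lemma is further discussed in Subsection 6.E.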
Theorem 3.2. If $A$ is an $n \times n$ UTTM with $\|A\| \le 1$, then there exists an analytic function, $f$, on $\mathbb{D}$ such that

\[ \sup_{|z| < 1} |f(z)| \le 1 \tag{3.22} \]

and

\[ A = T_n(f) \tag{3.23} \]

Proof. The proof is by induction on $n$. If $n = 1$, $\|A\| \le 1$ means $|a_0| \le 1$ and we can take $f(z) = a_0$. For general $n$, $\|A\| \le 1$ means $|a_0| \le 1$. If $|a_0| = 1$, then $A = a_0 \mathbf{1}$ and we can take $f(z) = a_0$. If $|a_0| < 1$, define $B$ by (3.18) with $\alpha = a_0$. $B$ is a UTTM with zero diagonal terms, so

\[ B = \begin{pmatrix} 0 & \tilde{B} \\ 0 & 0 \end{pmatrix} \tag{3.24} \]

where $\|\tilde{B}\| = \|B\| \le 1$ by the lemma.

By the induction hypothesis, $\tilde{B} = T_{n-1}(g)$ where

\[ \sup_{|z| < 1} |g(z)| \le 1 \tag{3.25} \]

Then (3.23) holds with

\[ f = \frac{a_0 + zg}{1 + \bar{a}_0 zg} \tag{3.26} \]

(3.25) and (3.26) imply (3.22). □


Remarks. 1. By iterating $f \to g$, we see that one constructs $f$ via the Schur algorithm; see Section 1.3 of [17].

2. Combining this and (3.4), one obtains Schur's celebrated result that $a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}$ is the start of the Taylor series of a Schur function if and only if the matrix $A$ of (3.1) obeys $A^*A \le 1$.
This result is intimately connected to Nehari's theorem on the norm of 
Hankel operators [HI EH]; see Partington [T2"] . 

3. This is classical; see [H ITUI ITS]. 



To state the last result of this section, we need a definition:

Definition. A Blaschke factor is a function on $\mathbb{D}$ of the form

\[ f(z, w) = \frac{z - w}{1 - \bar{w} z} \tag{3.27} \]

where $w \in \mathbb{D}$. A (finite) Blaschke product is a function of the form

\[ f(z) = \omega \prod_{j=1}^{k} f(z, w_j) \tag{3.28} \]

where $\omega \in \partial\mathbb{D}$. $k$ is called the order of $f$. We allow $k = 0$, in which case $f(z)$ is a constant value in $\partial\mathbb{D}$.

Theorem 3.3. An $n \times n$ UTTM, $A$, has $\|A\| = c$ if and only if $A = T_n(f)$ for an $f$ so that $c^{-1} f$ is a Blaschke product of order $k \le n - 1$.

Proof. (See as alternates: [HIIIIB].) Without loss, we can take $c = 1$. The proof is by induction on $n$. If $n = 1$, $k$ must be 0, and the theorem says $|a_0| = 1$ if and only if $f(0) = a_0 \in \partial\mathbb{D}$, which is true.
It is not hard to see that if $f$ and $f_1$ are related by

\[ z f_1(z) = \frac{f(z) - f(0)}{1 - \overline{f(0)} f(z)} \]

then $f$ is a Blaschke product of order $k \ge 1$ if and only if $f_1$ is a Blaschke product of order $k - 1$.

Given $A$ a UTTM with $\|A\| \le 1$, $|a_0| = 1$ if and only if $A = T_n(a_0)$, that is, $A$ is given by a Blaschke product of order 0. If $|a_0| < 1$, we define $B$ by (3.18). $\|\tilde{B}\| = 1$ if and only if $\|A\| = 1$. $\tilde{B}$ given by (3.24) is related to $A$ by $A = T_n(f)$ if and only if $\tilde{B} = T_{n-1}(f_1)$. Thus, by induction, $\|A\| = 1$ if and only if $f$ is a Blaschke product of order $k \le n - 1$. □

4. Inverse of Differential/Difference Operators 

In this section and the next, we will find explicit formulae for the norms of $M_n$ and $Q_n \equiv Q_n(1)$ given by (1.28). Indeed, we will find all the eigenvalues and eigenvectors for $|M_n|$ and $|Q_n|$ where $|A| = \sqrt{A^*A}$. A key to our finding this was understanding a kind of continuum limit of $M_n$: Let $K$ be the Volterra-type operator on $\mathcal{H} = L^2([0, 1], dx)$ with integral kernel

\[ K(x, y) = \begin{cases} 1 & 0 \le x \le y \le 1 \\ 0 & 0 \le y < x \le 1 \end{cases} \]

In some formal sense, $K$ is a limit of either $M_n$ or $Q_n$, but in a precise sense, $M_n$ is a restriction of $K$:

Proposition 4.1. Let $\pi_n$ be the projection of $\mathcal{H}$ onto the space of functions constant on each interval $[\frac{j}{n}, \frac{j+1}{n})$, $j = 0, 1, \dots, n - 1$. Then

\[ \pi_n K \pi_n \tag{4.1} \]

is unitarily equivalent to $\frac{1}{2} M_n / n$. In particular,

\[ \|M_n\| \le 2n \|K\| \tag{4.2} \]

\[ \lim_{n \to \infty} \frac{\|M_n\|}{n} = 2 \|K\| \tag{4.3} \]
Proof. Let $\{f_j^{(n)}\}_{j=0}^{n-1}$ be the functions

\[ f_j^{(n)}(x) = \begin{cases} \sqrt{n} & \frac{j}{n} \le x < \frac{j+1}{n} \\ 0 & \text{otherwise} \end{cases} \tag{4.4} \]

which form an orthonormal basis for $\mathrm{Ran}(\pi_n)$. Since

\[ n \langle f_j^{(n)}, K f_k^{(n)} \rangle = \tfrac{1}{2} (M_n)_{jk} \tag{4.5} \]

we have the claimed unitary equivalence. (4.2) is immediate from $\|\pi_n K \pi_n\| \le \|K\|$. (4.3) follows if we note $\text{s-lim}_{n \to \infty} \pi_n = 1$, so $\lim \|\pi_n K \pi_n\| = \|K\|$. □

Notice that

\[ (Kf)(x) = \int_x^1 f(y) \, dy \tag{4.6} \]

so

\[ \frac{d}{dx} (Kf) = -f, \qquad (Kf)(1) = 0 \tag{4.7} \]

and $K$ is an inverse of a derivative. That means $K^*K$ will be the inverse of a second-order operator. Indeed,

\[ (K^*K)(x, y) = \int_0^1 \overline{K(z, x)} \, K(z, y) \, dz = \int_0^{\min(x, y)} dz = \min(x, y) \tag{4.8} \]

which, as is well known, is the integral kernel of the inverse of $-\frac{d^2}{dx^2}$ with $u(0) = 0$, $u'(1) = 0$ boundary conditions.

We can therefore write down a complete orthogonal basis of eigenfunctions for $K^*K$:

\[ \varphi_n(x) = \sin\left(\tfrac{1}{2} (2n - 1) \pi x\right), \qquad n = 1, 2, \dots \tag{4.9} \]

\[ K^*K \varphi_n = \frac{4}{(2n - 1)^2 \pi^2} \, \varphi_n \tag{4.10} \]

so

\[ \|K\| = \|K^*K\|^{1/2} = \frac{2}{\pi} \tag{4.11} \]

By (4.2), (4.3), we have

Corollary 4.2.

\[ \|M_n\| \le \frac{4n}{\pi} \tag{4.12} \]

\[ \lim_{n \to \infty} \frac{\|M_n\|}{n} = \frac{4}{\pi} \tag{4.13} \]

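Of course, we will see this when we have proven Theorem 2, but it is interesting to have it now.

While $M_n$ is related to differential operators via (4.5), we can compute the norm of $Q_n$ by realizing it as the inverse of a difference operator. Specifically, let $N_n$ be given by (1.13). Then

\[ (1 - N_n)^{-1} = 1 + N_n + N_n^2 + \cdots + N_n^{n-1} = Q_n \tag{4.14} \]

Theorem 4.3. Let

\[ D_n = (1 - N_n)(1 - N_n^*) \tag{4.15} \]

Then $D_n$ has a complete set of eigenvectors:

\[ v_j^{(\ell)} = \sin\left(\frac{\pi (2\ell + 1) j}{2n + 1}\right), \qquad j = 1, \dots, n; \quad \ell = 0, \dots, n - 1 \tag{4.16} \]

\[ D_n v^{(\ell)} = 4 \sin^2\left(\frac{\pi (2\ell + 1)}{2(2n + 1)}\right) v^{(\ell)} \tag{4.17} \]

In particular,

\[ \|Q_n\| = (\text{min eigenvalue of } D_n)^{-1/2} = \left[ 2 \sin\left(\frac{\pi}{4n + 2}\right) \right]^{-1} \tag{4.18} \]
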

Proof. By a direct calculation,

\[ D_n = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix} \tag{4.19} \]

is a discrete Laplacian with Dirichlet boundary condition at 0 and Neumann at $n$. Since

\[ -\sin(q(j + 1)) + 2 \sin(qj) - \sin(q(j - 1)) = 4 \sin^2\left(\frac{q}{2}\right) \sin(qj) \]

(4.16)/(4.17) hold so long as $q$ is such that $\sin(q(n + 1)) = \sin(qn)$, that is,

\[ \tfrac{1}{2} [q(n + 1) + qn] = (\ell + \tfrac{1}{2}) \pi \]

or $q = (2\ell + 1)\pi/(2n + 1)$. □
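
Theorem 4.3 lends itself to a direct numerical check. The sketch below builds the tridiagonal matrix (4.19), applies it to the vectors (4.16), and compares with the eigenvalues (4.17); the size $n = 7$ and the function names are arbitrary choices made for illustration.

```python
import math

def D(n):
    """D_n = (1 - N_n)(1 - N_n^*): tridiagonal (-1, 2, -1) with last diagonal entry 1."""
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        m[j][j] = 2.0 if j < n - 1 else 1.0
        if j + 1 < n:
            m[j][j + 1] = m[j + 1][j] = -1.0
    return m

def eigenpair(n, l):
    """Eigenvalue (4.17) and eigenvector (4.16) for index l = 0, ..., n-1."""
    q = (2 * l + 1) * math.pi / (2 * n + 1)
    v = [math.sin(q * j) for j in range(1, n + 1)]
    return 4 * math.sin(q / 2) ** 2, v

def residual(n, l):
    """Largest component of D_n v - lambda v; should vanish up to rounding."""
    lam, v = eigenpair(n, l)
    m = D(n)
    dv = [sum(m[i][k] * v[k] for k in range(n)) for i in range(n)]
    return max(abs(dv[i] - lam * v[i]) for i in range(n))

n = 7
worst = max(residual(n, l) for l in range(n))
# ||Q_n|| = (smallest eigenvalue of D_n)^(-1/2), as in (4.18).
norm_Q = min(eigenpair(n, l)[0] for l in range(n)) ** -0.5
```

The smallest eigenvalue occurs at $\ell = 0$, so `norm_Q` should agree with $[2 \sin(\pi/(4n + 2))]^{-1}$.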
Remark. For OPUC with $d\mu = d\theta/2\pi$, in the basis $1, z, \dots, z^{n-1}$, $A_n$ is given by the matrix, $N_n$, of (1.13), and so $\|(1 - N_n)^{-1}\| = \|Q_n\| \sim 2n/\pi$. Thus, there are unit vectors, $y_n$, in this case with $\|(1 - A_n) y_n\| \sim \pi/2n$.

5. The Norm of M n 

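In this section, we will give two distinct but related proofs of Theorem 2. Both depend on a generating function relation:

Theorem 5.1. For $\theta \in (0, \pi)$ and $z \in \mathbb{D}$, define

\[ S_\theta(z) = \sum_{j=0}^{\infty} \sin((2j + 1)\theta) \, z^j \tag{5.1} \]

\[ C_\theta(z) = \sum_{j=0}^{\infty} \cos((2j + 1)\theta) \, z^j \tag{5.2} \]

Then

\[ \frac{1 + z}{1 - z} \, C_\theta(z) = \cot(\theta) \, S_\theta(z) \tag{5.3} \]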

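Proof. Let $\omega = e^{i\theta}$ so, summing the geometric series,

\[ S_\theta(z) = (2i)^{-1} \sum_{j=0}^{\infty} (\omega^{2j+1} - \bar\omega^{2j+1}) z^j = (2i)^{-1} \left[ \frac{\omega}{1 - z\omega^2} - \frac{\bar\omega}{1 - z\bar\omega^2} \right] \tag{5.4} \]

\[ = \frac{\sin(\theta)(1 + z)}{(1 - z\omega^2)(1 - z\bar\omega^2)} \tag{5.5} \]

For $C_\theta(z)$, the calculation is similar; in (5.4), $(2i)^{-1}$ is replaced by $(2)^{-1}$ and the minus sign becomes a plus:

\[ C_\theta(z) = \frac{\cos(\theta)(1 - z)}{(1 - z\omega^2)(1 - z\bar\omega^2)} \tag{5.6} \]

(5.5) and (5.6) imply (5.3). □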
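
The relation (5.3) and the closed form (5.5) can be tested at a sample point by summing the series numerically. The values of $\theta$ and $z$ below, and the truncation at 200 terms, are arbitrary choices for this sketch; with $|z| = 1/2$ the neglected tail is far below rounding error.

```python
import cmath
import math

def partial_sums(theta, z, terms=200):
    """Truncated sums of S_theta(z) and C_theta(z) from (5.1) and (5.2)."""
    s = c = 0j
    power = 1 + 0j  # running value of z^j
    for j in range(terms):
        s += math.sin((2 * j + 1) * theta) * power
        c += math.cos((2 * j + 1) * theta) * power
        power *= z
    return s, c

theta, z = 0.7, 0.3 + 0.4j
s, c = partial_sums(theta, z)

lhs = (1 + z) / (1 - z) * c      # left side of (5.3)
rhs = s / math.tan(theta)        # right side of (5.3)

omega = cmath.exp(1j * theta)
closed_S = (math.sin(theta) * (1 + z)
            / ((1 - z * omega ** 2) * (1 - z * omega.conjugate() ** 2)))  # (5.5)
```

Agreement of `lhs` with `rhs`, and of `s` with `closed_S`, is what the proof predicts.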



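Our first proof of Theorem 2 depends on looking at the Hankel matrix

\[ \tilde{M}_n = \begin{pmatrix} 2 & 2 & \cdots & 2 & 1 \\ 2 & 2 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 1 & 0 & \cdots & & 0 \end{pmatrix} \tag{5.7} \]

If $W$ is the unitary permutation matrix

\[ (Wv)_j = v_{n+1-j} \tag{5.8} \]

then

\[ \tilde{M}_n = M_n W, \qquad M_n = \tilde{M}_n W \tag{5.9} \]

and so

\[ \|\tilde{M}_n\| = \|M_n\| \tag{5.10} \]

Here is our first proof of Theorem 2:

Theorem 5.2. Let

\[ c_j^{(\ell)} = \cos\left( (2\ell + \tfrac{1}{2}) \frac{\pi}{2n} (2j - 1) \right), \qquad j = 1, 2, \dots, n; \quad \ell = 0, \dots, n - 1 \tag{5.11} \]

Then

\[ \tilde{M}_n c^{(\ell)} = \cot\left( (2\ell + \tfrac{1}{2}) \frac{\pi}{2n} \right) c^{(\ell)} \tag{5.12} \]

Thus,

\[ \|M_n\| = \|\tilde{M}_n\| = \cot\left(\frac{\pi}{4n}\right) \tag{5.13} \]

Proof. Let

\[ s_j^{(n;\theta)} = \sin(\theta(2j - 1)), \qquad j = 1, \dots, n \tag{5.14} \]

and

\[ c_j^{(n;\theta)} = \cos(\theta(2j - 1)), \qquad j = 1, 2, \dots, n \tag{5.15} \]

Then (5.3) implies that

\[ M_n W c^{(n;\theta)} = \cot(\theta) \, W s^{(n;\theta)} \tag{5.16} \]

by looking at coefficients of $1, z, \dots, z^{n-1}$. The $W$ comes from (3.6)/(3.8). If

\[ 2n\theta = \frac{\pi}{2} + 2\ell\pi, \qquad \ell = 0, \dots, n - 1 \tag{5.17} \]

then

\[ W s^{(n;\theta)} = c^{(n;\theta)} \tag{5.18} \]

and (5.16) becomes (5.12).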

Since $\tilde{M}_n$ is self-adjoint, (5.13) follows from (5.12) either by noting that $\max_\ell |\cot((2\ell + \frac{1}{2}) \frac{\pi}{2n})| = \cot(\frac{\pi}{4n})$ or by noting that $c^{(n;\theta = \pi/4n)}$ is a positive eigenvector of a positive self-adjoint matrix, so its eigenvalue is the norm by the Perron-Frobenius theorem. □
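
The eigenvector relation (5.12) can be confirmed numerically for a small matrix. The sketch below builds the Hankel matrix (5.7) entrywise and tests every $\ell$; the choice $n = 6$ and the helper names are illustrative only.

```python
import math

def hankel(n):
    """Matrix (5.7), 1-indexed: entry (j, k) is 2 if j + k < n + 1, 1 if j + k = n + 1, else 0."""
    return [[2.0 if j + k < n + 1 else (1.0 if j + k == n + 1 else 0.0)
             for k in range(1, n + 1)] for j in range(1, n + 1)]

def eigen_residual(n, l):
    """Return (max residual of (5.12), eigenvalue cot(theta)) for the vector (5.11)."""
    theta = (2 * l + 0.5) * math.pi / (2 * n)
    c = [math.cos(theta * (2 * j - 1)) for j in range(1, n + 1)]
    m = hankel(n)
    mc = [sum(m[i][k] * c[k] for k in range(n)) for i in range(n)]
    lam = 1.0 / math.tan(theta)
    return max(abs(mc[i] - lam * c[i]) for i in range(n)), lam

n = 6
results = [eigen_residual(n, l) for l in range(n)]
# For a self-adjoint matrix the norm is the largest eigenvalue in modulus.
largest = max(abs(lam) for _, lam in results)
```

Since the $n$ eigenvalues found are distinct, they exhaust the spectrum, and `largest` should equal $\cot(\pi/4n)$.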

Our second proof relies on the following known result (see Milovanović et al. [5], page 272, and references therein; this result is called the
Enestrom-Kakeya theorem; see also Polya-Szego jHj, problem 22 on 
pp. 107 and 301, who also mention Hurwitz): 
Lemma 5.3. Suppose

\[ 0 < a_0 < a_1 < \cdots < a_n \tag{5.19} \]

Then

\[ P(z) = a_0 + a_1 z + \cdots + a_n z^n \tag{5.20} \]

has all its zeros in $\mathbb{D}$.

Theorem 5.4. Let

\[ S^{(n)}(z) = \sum_{j=0}^{n-1} \sin\left( (2j + 1) \frac{\pi}{4n} \right) z^j \tag{5.21} \]

\[ C^{(n)}(z) = \sum_{j=0}^{n-1} \cos\left( (2j + 1) \frac{\pi}{4n} \right) z^j \tag{5.22} \]

Then

\[ b_n(z) = \frac{S^{(n)}(z)}{C^{(n)}(z)} \tag{5.23} \]

is a Blaschke product of order $n - 1$. Moreover,

\[ \cot\left(\frac{\pi}{4n}\right) b_n(z) = 1 + 2 \sum_{j=1}^{n-1} z^j + O(z^n) \tag{5.24} \]

and

\[ \|M_n\| = \cot\left(\frac{\pi}{4n}\right) \tag{5.25} \]

Proof. The coefficients of $S^{(n)}$ obey (5.19) so, by the lemma, $S^{(n)}$ has all its zeros in $\mathbb{D}$. Moreover, by (5.18), $C^{(n)}(z) = z^{n-1} S^{(n)}(1/z)$, which implies (5.23) is a Blaschke product.

(5.24) is just a translation of (5.3). (5.24) implies (5.25) by Theorem 3.3. □

6. Some Remarks and Extensions 

In this section, we make some remarks that shed light on or extend 
Theorem 1, our main result. 

A. An alternate proof. We give a simple proof of a weakened version of Theorem 4 but which suffices for applications like those in Section 7.
This argument is related to ones in Section 3 of Nikolski ^T] . 

Theorem 6.1. If $\|A\| \le 1$ and $1 \notin \mathrm{spec}(A)$, then

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1}\| \le 2m \tag{6.1} \]

where $m$ is the degree of the minimal polynomial for $A$.

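Proof. We prove the result for $\|A\| < 1$. The general result follows by taking limits. We make repeated use of Lemma 3.1 which implies that if, for $\lambda \in \mathbb{D}$, we define

\[ B(\lambda) = \left( \frac{A - \lambda}{1 - \bar\lambda A} \right) \left( \frac{1 - \bar\lambda}{1 - \lambda} \right) \tag{6.2} \]

then

\[ \|B(\lambda)\| \le 1 \tag{6.3} \]

By algebra,

\[ (1 - x)^{-1} \left[ 1 - \left( \frac{x - \lambda}{1 - \bar\lambda x} \right) \left( \frac{1 - \bar\lambda}{1 - \lambda} \right) \right] = (1 - \lambda)^{-1} \left[ 1 + \bar\lambda \left( \frac{x - \lambda}{1 - \bar\lambda x} \right) \right] \tag{6.4} \]

so, by Lemma 3.1 again,

\[ \|(1 - A)^{-1} (1 - B(\lambda))\| \le |1 - \lambda|^{-1} (1 + |\lambda|) \tag{6.5} \]

Now let $\prod_{j=1}^{m} (x - \lambda_j)$ be the minimal polynomial for $A$. Then

\[ \prod_{j=1}^{m} B(\lambda_j) = 0 \]

so

\[ (1 - A)^{-1} = (1 - A)^{-1} \left[ 1 - \prod_{j=1}^{m} B(\lambda_j) \right] = \sum_{j=1}^{m} (1 - A)^{-1} [1 - B(\lambda_j)] \prod_{k=j+1}^{m} B(\lambda_k) \tag{6.6} \]

(the empty product for $j = m$ is interpreted as the identity operator) which, by (6.3) and (6.5), implies

\[ \text{LHS of (6.1)} \le \sum_{j=1}^{m} \mathrm{dist}(1, \mathrm{spec}(A)) \, |1 - \lambda_j|^{-1} (1 + |\lambda_j|) \le 2m \]

since $1 + |\lambda_j| \le 2$ and $\lambda_j \in \mathrm{spec}(A)$ so $\mathrm{dist}(1, \mathrm{spec}(A)) \, |1 - \lambda_j|^{-1} \le 1$. □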

Remarks. 1. The factor $(1 - \bar\lambda)/(1 - \lambda)$ is taken in (6.2) so $f_\lambda(z) \equiv (z - \lambda)(1 - \bar\lambda z)^{-1} (1 - \bar\lambda)(1 - \lambda)^{-1}$ has $1 - f_\lambda(1) = 0$.

2. In place of the algebra (6.4), one can compute that the $\sup_{|z| \le 1}$ of the LHS of (6.4) is $|1 - \lambda|^{-1} [1 + |\lambda|]$ and use von Neumann's theorem as discussed in Subsection E below.
B. Minimal polynomials. While the constant 2 in (6.1) is worse than $4/\pi$ in (1.19)/(1.21), (6.1) appears to be stronger in that $m$, not $n$, appears, but we can also strengthen (1.19) in this way:

Theorem 6.2. If $\|A\| \le 1$, $1 \notin \mathrm{spec}(A)$, and $m$ is the degree of the minimal polynomial for $A$, then

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1}\| \le \cot\left(\frac{\pi}{4m}\right) \tag{6.7} \]

Proof. Let $\|y\| = 1$. Since $A^m y$ is a linear combination of $\{A^j y\}_{j=0}^{m-1}$, the cyclic subspace, $V_y$, has $\dim(V_y) \equiv m_y \le m$. Since $A \restriction V_y$ is an operator on a space of dimension $m_y$, we have

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \, \|(1 - A)^{-1} y\| \le c(m_y) = \cot\left(\frac{\pi}{4m_y}\right) \le \cot\left(\frac{\pi}{4m}\right) \]
□

C. Numerical range. For any bounded operator, $A$, on a Hilbert space, the numerical range, $\mathrm{Num}(A)$, is defined by

\[ \mathrm{Num}(A) = \{\langle \varphi, A\varphi \rangle \mid \|\varphi\| = 1\} \tag{6.8} \]

It is a bounded convex set (see p. 150]), and when A is a finite 
matrix, also closed. Theorem 1 can be improved to read: 

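Theorem 6.3. Let $\tilde{\mathcal{M}}_n$ be the set of pairs $(A, z)$ where $A$ is an $n \times n$ matrix, $z \in \mathbb{C}$ with

\[ z \notin \mathrm{spec}(A), \qquad z \notin \mathrm{Num}(A)^{\mathrm{int}} \tag{6.9} \]

Then

\[ \sup_{\tilde{\mathcal{M}}_n} \mathrm{dist}(z, \mathrm{spec}(A)) \, \|(A - z)^{-1}\| = \cot\left(\frac{\pi}{4n}\right) \tag{6.10} \]

Remarks. 1. Since $\mathrm{Num}(A) \subset \{z \mid |z| \le \|A\|\}$, $\mathcal{M}_n \subset \tilde{\mathcal{M}}_n$, and this is a strict improvement of (1.19).

2. We need only prove

\[ \mathrm{dist}(z, \mathrm{spec}(A)) \, \|(A - z)^{-1}\| \le \cot\left(\frac{\pi}{4n}\right) \]

since the equality then follows from $\mathcal{M}_n \subset \tilde{\mathcal{M}}_n$.

3. By replacing $A$ by $e^{i\theta}(A - z)$ for suitable $\theta$ and $z$, we need only prove

\[ \mathrm{Re}(A) \ge 0, \quad 0 \notin \mathrm{spec}(A) \;\Rightarrow\; \mathrm{dist}(0, \mathrm{spec}(A)) \, \|A^{-1}\| \le \cot\left(\frac{\pi}{4n}\right) \tag{6.11} \]

for by convexity of $\mathrm{Num}(A)$, if $z \notin \mathrm{Num}(A)^{\mathrm{int}}$, there is a half-plane, $P$, with $\mathrm{Num}(A) \subset P$ and $z \in \partial P$. It is (6.11) we will prove below.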

First Proof of Theorem 6.3. Let

\[ C = A^{-1} + (A^*)^{-1} \tag{6.12} \]

\[ = (A^*)^{-1} [2 \, \mathrm{Re}(A)] (A)^{-1} \ge 0 \tag{6.13} \]

Thus,

\[ |C_{jk}| \le |C_{jj}|^{1/2} |C_{kk}|^{1/2} \tag{6.14} \]

Now just follow the proof of Theorem 4 in Section 2. □

Second Proof of Theorem 6.3. We use Cayley transforms. For $0 < s$, define

\[ B(s) = (1 - sA)(1 + sA)^{-1} \tag{6.15} \]

Since

\[ \|(1 + sA)\varphi\|^2 - \|(1 - sA)\varphi\|^2 = 4s \, \mathrm{Re}\langle \varphi, A\varphi \rangle \ge 0 \]

we have that

\[ \|B(s)\| \le 1 \tag{6.16} \]

Because

\[ 1 - B(s) = 2sA(1 + sA)^{-1} \tag{6.17} \]

we have for $s$ small that

\[ \mathrm{dist}(1, \mathrm{spec}(B(s))) = 2s \, \mathrm{dist}(0, \mathrm{spec}(A)) + O(s^2) \tag{6.18} \]

Thus, by Theorem 1,

\[ 2s \, \mathrm{dist}(0, \mathrm{spec}(A)) \, \|(1 - B(s))^{-1}\| \le \cot\left(\frac{\pi}{4n}\right) + O(s) \tag{6.19} \]

By (6.17),

\[ (1 - B(s))^{-1} = (2s)^{-1} [A^{-1} + s] \]

so

\[ \|A^{-1}\| \le |s| + 2s \|(1 - B(s))^{-1}\| \tag{6.20} \]

This plus (6.19) implies (6.11) as $s \downarrow 0$. □
D. Bounded powers. We note that there is also a result if

\[ \sup_{m \ge 0} \|A^m\| = c < \infty \tag{6.21} \]

We suspect the 3/2 power in the following is not optimal. We note that one can also use this method if $\|A^m\|$ is polynomially bounded in $m$.

Theorem 6.4. If (6.21) holds, then

\[ \|(1 - A)^{-1}\| \le c \, (3n)^{3/2} \, \mathrm{dist}(1, \mathrm{spec}(A))^{-3/2} \tag{6.22} \]

Proof. By the argument of Section 1 (using (1.11)), this is equivalent to

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \le 3n \left( c \|(1 - A) y\| \right)^{2/3} \tag{6.23} \]

for all unit vectors $y$.
Define for $1 < r$,

\[ \langle f, g \rangle_r = \sum_{m=0}^{\infty} r^{-2m} \langle A^m f, A^m g \rangle \tag{6.24} \]

By (6.21),

\[ \|f\| \le \|f\|_r \le c \, r (r^2 - 1)^{-1/2} \|f\| \tag{6.25} \]

By (6.24),

\[ \|Af\|_r^2 \le r^2 \|f\|_r^2 \tag{6.26} \]

so

\[ \|A\|_r \le r \tag{6.27} \]

so if $C = r^{-1} A$, then

\[ \|C\|_r \le 1 \tag{6.28} \]

Clearly, for $\|y\| = 1 \le \|y\|_r$,

\[ \begin{aligned} \|Cy - y\|_r &\le |r^{-1} - 1| \, \|y\|_r + r^{-1} \|(A - 1) y\|_r \\ &\le |r^{-1} - 1| \, \|y\|_r + c (r^2 - 1)^{-1/2} \|(A - 1) y\| \\ &\le \left( (r - 1) + c [2(r - 1)]^{-1/2} \|(A - 1) y\| \right) \|y\|_r \end{aligned} \tag{6.29} \]

It follows by Theorem 1 and the fact that $\mathrm{spec}(A)$ is independent of $\|\cdot\|_r$, that

\[ \mathrm{dist}(1, r^{-1} \mathrm{spec}(A)) \le \frac{4n}{\pi} \left\{ c \|(A - 1) y\| (2(r - 1))^{-1/2} + (r - 1) \right\} \tag{6.30} \]

and thus

\[ \mathrm{dist}(1, \mathrm{spec}(A)) \le (r - 1) + \frac{4n}{\pi} \left\{ c \|(A - 1) y\| (2(r - 1))^{-1/2} + (r - 1) \right\} \tag{6.31} \]

Choosing $r = 1 + \frac{1}{2} (c \|(A - 1) y\|)^{2/3}$ and using $\frac{1}{2} + \frac{6n}{\pi} < 3n$, we obtain (6.23). □

E. Von Neumann's theorem. Lemma 3.1 is a special case of a theorem of von Neumann. The now standard proof of this result uses Nagy dilations; we have found a simple alternative that relies on

Lemma 6.5. For any $A$, with $\|A\| < 1$ and $A = U|A|$, and $U$ unitary, there exists an operator-valued function, $g$, analytic in a neighborhood of $\bar{\mathbb{D}}$ so that $g(e^{i\theta})$ is unitary and $g(0) = A$.

Proof. Let

\[ g(z) = U \left[ \frac{z + |A|}{1 + z|A|} \right] \tag{6.32} \]

The factor in $[\dots]$ is unitary if $z = e^{i\theta}$, since

\[ (e^{i\theta} + |A|)^* (e^{i\theta} + |A|) = 1 + A^*A + 2\cos\theta \, |A| = (1 + e^{i\theta} |A|)^* (1 + e^{i\theta} |A|) \]
□

Theorem 6.6 (von Neumann |2H]). Let $f : \mathbb{D} \to \mathbb{D}$. For $\|A\| < 1$, define $f(A)$ by

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad f(A) = \sum_{n=0}^{\infty} a_n A^n \tag{6.33} \]

Then

\[ \|f(A)\| \le 1 \tag{6.34} \]

Proof of von Neumann's theorem, given the lemma. Suppose first that $A$ obeys the hypotheses of the lemma. By a limiting argument, suppose $f$ is analytic in a neighborhood of $\bar{\mathbb{D}}$. Applying the maximum principle to $f(g(z))$, we see

\[ \|f(A)\| = \|f(g(0))\| \le \sup_\theta \|f(g(e^{i\theta}))\| = \sup_\theta |f(e^{i\theta})| \le 1 \tag{6.35} \]

where (6.35) uses the spectral theorem for the unitary $g(e^{i\theta})$.

For general $A$, if $\tilde{A} = A \oplus 0$ on $\mathcal{H} \oplus \mathcal{H}$, then $\tilde{A} = U|\tilde{A}|$ with $U$ unitary and we obtain $\|f(\tilde{A})\| \le 1$. But $f(\tilde{A}) = f(A) \oplus 0$. □

Remarks. 1. In general, $A = V|A|$ with $V$ a partial isometry. We can extend this to a unitary $U$ so long as $\dim(\mathrm{Ran}(V)^\perp) = \dim(\ker(V)^\perp)$. This is automatic in the finite-dimensional case and also if $\dim(\mathcal{H}) = \infty$ for $A \oplus 0$ since then both spaces are infinite-dimensional.

2. This proof is close to one of Nelson |H| who also uses the maximum principle and polar decomposition, but uses a different method for interpolating the self-adjoint part (see also Nikolski [10]).

7. Zeros of Random OPUC 

In this section, we apply Theorem 1 to obtain results on certain OPUC. We begin by recalling the recursion relations for OPUC [T7| H%1 IT!?]. For each non-trivial probability measure, $d\mu$, on $\partial\mathbb{D}$, there is a sequence of complex numbers, $\{\alpha_n(d\mu)\}_{n=0}^{\infty}$, called Verblunsky coefficients so that

\[ \Phi_{n+1}(z) = z\Phi_n(z) - \bar\alpha_n \Phi_n^*(z) \tag{7.1} \]

where

\[ \Phi_n^*(z) = z^n \, \overline{\Phi_n(1/\bar{z})} \tag{7.2} \]

The $\alpha_n$ obey $|\alpha_n| < 1$ and Verblunsky's theorem [T7J HH] says that $\mu \mapsto \{\alpha_n(d\mu)\}_{n=0}^{\infty}$ is a bicontinuous bijection from the non-trivial measures on $\partial\mathbb{D}$ with the topology of vague convergence to $\mathbb{D}^\infty$ with the product topology.

For each $\rho$ in $(0, 1)$, we define the $\rho$-model to be the set of random Verblunsky coefficients where $\alpha_n$ are independent, identically distributed random variables, each uniformly distributed in $\{z \mid |z| \le \rho\}$. A point in the model space of $\alpha$'s will be denoted $\omega$, $\Phi_n(z; \omega)$ will be the corresponding OPUC and $\{z_j^{(n)}(\omega)\}_{j=1}^{n}$ the zeros of $\Phi_n$ counting multiplicity. Our results here depend heavily on earlier results of Stoiciu [20, 21], who studied a closely related problem (see below). In turn, Stoiciu relied, in part, on earlier work on eigenvalues of random Schrödinger operators [TJE].

We will prove the following three theorems: 

Theorem 7.1. Let $0 < \rho < 1$. Let $k \in \{1, 2, \dots\}$. Then for a.e. $\omega$ in the $\rho$-model,

\[ \limsup_{n \to \infty} \frac{\#\{j \mid |z_j^{(n)}(\omega)| < 1 - n^{-k}\}}{[\log(n)]^2} < \infty \tag{7.3} \]

Thus, the overwhelming bulk of zeros are polynomially close to $\partial\mathbb{D}$. If we look at a small slice of argument, we can say more:

Theorem 7.2. Let $0 < \rho < 1$. Let $\theta_0 \in [0, 2\pi)$ and $a < b$ real. Let $\eta < 1$. Then with probability 1, for large $n$, there are no zeros in $\{z \mid \arg z \in (\theta_0 + \frac{2\pi a}{n}, \theta_0 + \frac{2\pi b}{n}); \; |z| < 1 - \exp(-n^\eta)\}$.

Finally and most importantly, we can describe the statistical distribution of the arguments:

Theorem 7.3. Let $0 < \rho < 1$. Let $\theta_0 \in [0, 2\pi)$. Let $a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_\ell < b_\ell$ and let $k_1, \dots, k_\ell$ be in $\{0, 1, 2, \dots\}$. Then as $n \to \infty$,

\[ \mathrm{Prob}\left( \#\left\{ j \;\middle|\; \arg z_j^{(n)}(\omega) \in \left( \theta_0 + \frac{2\pi a_m}{n}, \theta_0 + \frac{2\pi b_m}{n} \right) \right\} = k_m \text{ for } m = 1, \dots, \ell \right) \tag{7.4} \]

converges to

\[ \prod_{m=1}^{\ell} \frac{(b_m - a_m)^{k_m}}{k_m!} \, e^{-(b_m - a_m)} \tag{7.5} \]


This says the zeros are asymptotically Poisson distributed. As we stated, our proofs rely on ideas of Stoiciu, essentially using Theorem 1 to complete his program. To state the results of his that we use, we need a definition.

For $\beta \in \partial\mathbb{D}$, the paraorthogonal polynomials (POPUC) are defined by

\[ \Phi_n^{(\beta)}(z) = z\Phi_{n-1}(z) - \bar\beta \, \Phi_{n-1}^*(z) \tag{7.6} \]

These have zeros on $\partial\mathbb{D}$. Indeed, they are eigenvalues of a rank one unitary perturbation of the operator $A_n$ of (1.6). We extend the $\rho$-model to include an additional set of independent parameters $\{\beta_j\}_{j=0}^{\infty}$ in $\partial\mathbb{D}$, each uniformly distributed on $\partial\mathbb{D}$. $\tilde{z}_j^{(n)}(\omega)$ denotes the zeros of $\Phi_n^{(\beta_n)}(z; \omega)$. Stoiciu [20, 21] completely analyzed these POPUC zeros. We will need three of his results:

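Theorem 7.4 (= Theorem 6.1.3 of [21] = Theorem 6.3 of [20]). Let $I$ be an interval in $\partial\mathbb{D}$. Then

$$\mathrm{Prob}\bigl(\text{2 or more } \tilde z_j^{(n)}(\omega) \text{ lie in } I\bigr) \le \frac{1}{2}\left(\frac{n|I|}{2\pi}\right)^{2} \qquad (7.7)$$

where $|I|$ is the $d\theta$ measure of $I$.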

For the next theorem, we need the fact that there is an explicit realization of $A_n$ and the associated rank one perturbations as $n \times n$ complex CMV matrices (see [21 HZ| HE1 EH1), $\mathcal{C}_n$, whose eigenvalues are the $z_j^{(n)}$, and $\tilde{\mathcal{C}}_n$, whose eigenvalues are the $\tilde z_j^{(n)}$, so that

$$\|(\mathcal{C}_n - \tilde{\mathcal{C}}_n)\varphi\| \le |\varphi_{n-1}| + |\varphi_n| \qquad (7.8)$$

The next theorem uses the components so (7.8) holds.

Theorem 7.5 (= Theorem 1.1.2 of [21] = Theorem 2.2 of [20]). There exists a constant $D_2$ (depending only on $\rho$) so that for every eigenvector $\varphi^{(j;n)}$ of $\tilde{\mathcal{C}}_n$, we have for

$$|m - m(\varphi^{(j;n)})| \ge D_2 (\log n) \qquad (7.9)$$

that

$$|\varphi_m^{(j;n)}| \le C_\omega\, e^{-4|m - m(\varphi^{(j;n)})|/D_2} \qquad (7.10)$$

where $C_\omega$ is an a.e. finite constant and

$$m(\varphi) = \text{first } k \text{ so } |\varphi_k| = \max_m |\varphi_m| \qquad (7.11)$$

We will also need the results that Stoiciu proves along the way that for each $C_0$,

$$\{\omega \mid C_\omega < C_0\} = \Omega_{C_0} \qquad (7.12)$$

is invariant under rotation of the measures $d\mu_\omega$, and that for each $C_0$ fixed and all $\omega \in \Omega_{C_0}$,

$$\#\bigl(j \mid m(\varphi^{(j;n)}) = m_0\bigr) \le D_3 (\log n) \qquad (7.13)$$

where $D_3$ is only $C_0$-dependent and is independent of $\omega$, $m_0$, and $n$.
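(7.13) comes from the fact that, by (7.10), for $D_3$ only depending on $C_0$,

$$\sum_{|m - m(\varphi)| \ge \frac{1}{4} D_3 (\log n)} |\varphi_m|^{2} \le \tfrac{1}{2} \qquad (7.14)$$

so, by (7.11), for $\varphi$'s with $m(\varphi) = m_0$,

$$\tfrac{1}{2} D_3 (\log n)\, |\varphi_{m_0}|^{2} \ge \tfrac{1}{2} \qquad (7.15)$$

which, given

$$\sum_{j} |\varphi_{m_0}^{(j;n)}|^{2} = 1 \qquad (7.16)$$

implies (7.13).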

The last of Stoiciu's results we will need is 

Theorem 7.6 (= Theorem 1.0.6 of [21] = Theorem 1.1 of [20]). For $\theta_0 \in [0, 2\pi)$ and $a_1 < b_1 < a_2 < b_2 < \cdots < a_\ell < b_\ell$ and $k_1, \dots, k_\ell$ in $\{0, 1, 2, \dots\}$, we have, as $n \to \infty$, that (7.4) with $z_j^{(n)}$ replaced by $\tilde z_j^{(n)}$ converges to (7.5).

With this background out of the way, we begin the proofs of the new Theorems 7.1–7.3 with

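Theorem 7.7. Fix $\rho \in (0, 1)$. Then for a.e. $\omega$, there exists $N_\omega$ so that if $n \ge N_\omega$, then

$$\min_{j \ne k} |\tilde z_j^{(n)} - \tilde z_k^{(n)}| \ge 2n^{-4} \qquad (7.17)$$

Remark. $n^{-3-\varepsilon}$ will work in place of $n^{-4}$.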
Proof. For each $n$, cover $\partial\mathbb{D}$ by two sets of intervals of size $4n^{-4}$: one set non-overlapping, except at the end, starting with $[0, 4n^{-4}]$ and the other set starting with $[2n^{-4}, 6n^{-4}]$. If (7.17) fails for some $n$, then there are two zeros within one of these intervals. By (7.7), the probability of two zeros in one of these intervals is $O((n\,n^{-4})^{2}) = O(n^{-6})$. The number of intervals at order $n$ is $O(n^{4})$. Since $\sum_{n=1}^{\infty} n^{4} n^{-6} < \infty$, the probabilities of two zeros in an interval are summable. By the Borel–Cantelli lemma [22], for a.e. $\omega$, only finitely many intervals have two zeros. Hence, for large $n$, (7.17) holds. □
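For concreteness, the counting behind the Borel–Cantelli step can be written out as follows. This is only a sketch: $c$ stands for whatever constant (7.7) supplies in a bound of the form $\mathrm{Prob} \le c\,(n|I|)^{2}$, and $E_n$ (our notation) is the event that some interval of the two families at level $n$ contains two or more of the POPUC zeros.

```latex
\begin{align*}
\#\{\text{intervals at level } n\}
  &\le 2\left\lceil \frac{2\pi}{4n^{-4}} \right\rceil \le \pi n^{4} + 2, \\
\mathrm{Prob}(E_n)
  &\le (\pi n^{4} + 2)\, c\,\bigl(n \cdot 4n^{-4}\bigr)^{2}
   = 16c\,\bigl(\pi n^{-2} + 2n^{-6}\bigr), \\
\sum_{n=1}^{\infty} \mathrm{Prob}(E_n)
  &\le 16c \sum_{n=1}^{\infty} \bigl(\pi n^{-2} + 2n^{-6}\bigr) < \infty .
\end{align*}
```

The second family of intervals, offset by half an interval length, is what guarantees that two sufficiently close zeros always share an interval in at least one of the two families.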

Proof of Theorem 7.1. Obviously, if (7.3) holds for some $k$, it holds for all smaller $k$, so we will prove it for $k \ge 4$. We also need only prove it on any $\Omega_{C_0}$ given by (7.12), since $\bigcup_{C_0} \Omega_{C_0}$ has probability 1 by Theorem 7.5. Consider those $\varphi^{(j;n)}$ with

$$|m(\varphi^{(j;n)}) - n| \ge K(\log n) \qquad (7.18)$$

By (7.13), the number of $j$ for which (7.18) fails is $O((\log n)^{2})$.

By (7.10), (7.8), and the fact that $\varphi$ is a unit eigenfunction, we have

$$\|(\mathcal{C}_n - \tilde z_j^{(n)})\varphi^{(j;n)}\| \le 2C_\omega\, n^{-4K/D_2} \qquad (7.19)$$

so picking $K$ large enough and $n$ large enough that $\frac{8}{\pi} C_\omega n^{-1} < 1$, we have

$$\|(\mathcal{C}_n - \tilde z_j^{(n)})\varphi^{(j;n)}\| \le \frac{\pi}{4}\, n^{-k-1} \qquad (7.20)$$

Thus, by Theorem 1 and $\|\mathcal{C}_n\| = 1 = |\tilde z_j^{(n)}|$, we see that for each $j$ obeying (7.18), there is a $z_j^{(n)}$ so

$$|z_j^{(n)} - \tilde z_j^{(n)}| \le n^{-k} \qquad (7.21)$$

By Theorem 7.7 and $k \ge 4$, the $z_j^{(n)}$ are distinct for $n$ large, so we have $n - O((\log n)^{2})$ zeros with $|z_j^{(n)}| \ge 1 - n^{-k}$. This is (7.3). □
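The way Theorem 1 enters this proof can be checked numerically on small examples. The sketch below is an illustration under our own assumptions, not part of the argument: it takes Theorem 1 in the form stated in the abstract, $\|(z - A)^{-1}\| \le \cot(\frac{\pi}{4n})\,\mathrm{dist}(z, \mathrm{spec}(A))^{-1}$ for $|z| \ge \|A\|$, and compares both sides for a random matrix scaled to norm 1 (a stand-in, not a CMV matrix). The function name is hypothetical. Since $\|(A - z)\varphi\| \ge \|(z - A)^{-1}\|^{-1}$ for any unit vector $\varphi$, the same inequality bounds $\mathrm{dist}(z, \mathrm{spec}(A))$ by $\cot(\frac{\pi}{4n})\,\|(A - z)\varphi\|$, which is the form used to pass from a small residual to a nearby eigenvalue.

```python
import numpy as np

def resolvent_vs_bound(A, z):
    """Return (||(z - A)^{-1}||, cot(pi/(4n)) / dist(z, spec(A))) for an n x n matrix A."""
    n = A.shape[0]
    dist = np.min(np.abs(np.linalg.eigvals(A) - z))
    resolvent_norm = np.linalg.norm(np.linalg.inv(z * np.eye(n) - A), 2)
    bound = 1.0 / (np.tan(np.pi / (4 * n)) * dist)
    return resolvent_norm, bound

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A /= np.linalg.norm(A, 2)          # scale so the operator norm is 1
lhs, rhs = resolvent_vs_bound(A, 1.0)
print("resolvent norm:", lhs, " bound:", rhs)
```

A generic random matrix is far from extremal, so the two printed numbers are typically well separated.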

Proof of Theorem 7.2. In place of (7.18), we look for $\varphi$'s so

| m <y^)) - n | > ^ n 1 -^ (7.22) 

For such $j$'s, using the above arguments, there are zeros $z_j^{(n)}$ with

$$|z_j^{(n)} - \tilde z_j^{(n)}| \le C_\omega \exp(-2n^{\eta}) \qquad (7.23)$$



As in Stoiciu [20, 21], the distribution of the $\tilde z_j^{(n)}$ for which (7.22) fails is rotation invariant. Since their number is $O(n^{1-\eta} \log n)$ out of $O(n)$ zeros, the probability of any of these bad zeros lying in $\{z \mid \arg z \in (\theta_0 + \frac{2\pi a}{n}, \theta_0 + \frac{2\pi b}{n})\}$ goes to zero as $n \to \infty$. □

Proof of Theorem 7.3. By the last proof, the zeros of $\Phi_n$ with the given arguments lie within $O(e^{-n^{\eta}})$ of those of $\Phi_n^{(\beta_n)}$ and, by Theorem 7.7, these zeros are distinct. Theorem 7.6 completes the proof if one gets upper and lower bounds by slightly increasing/decreasing the intervals on an $O(1/n)$ scale. □

We close with a remark about improving these theorems. While (7.13) is the best one can hope for as a uniform bound, with overwhelming probability the number should be bounded. Thus, we expect in Theorem 7.1 that one can obtain $O(\log n)$ in place of $O((\log n)^{2})$. It is possible in Theorem 7.2 that one can improve $O(e^{-n^{\eta}})$ for all $\eta < 1$ to $O(e^{-An})$ for some $A$.